Submitted to the Annals of Applied Statistics 



MANIFOLD EMBEDDING FOR CURVE REGISTRATION 

By Chloe DIMEGLIO 

Institut de Mathematiques de Toulouse & Geosys 

AND 

By Jean-Michel Loubes 

Institut de Mathematiques de Toulouse 
AND 

By Elie Maza 
Institut National Polytechnique de Toulouse 

We focus on the problem of finding a good representative of a 
sample of random curves warped from a common pattern /. We first 
prove that such a problem can be moved onto a manifold framework. 
Then, we propose an estimation of the common pattern / based on 
an approximated geodesic distance on a suitable manifold. We then 
compare the proposed method to more classical methods. 



1. Introduction. The outcome of a statistical process is often a sam- 
ple of curves {fi, i = 1, . . . ,m} showing an unknown common structural 
pattern, /, which characterizes the behaviour of the observations. Exam- 
ples are numerous, among others, growth curves analysis in biology and 
medicine, quantitative analysis of microarrays in molecular biology and ge- 
netics, speech signals recognition in engineering, study of expenditure and 
income curves in economics. . . . Hence, among the last decades, there has 
been a growing interest to develop statistical methodologies which enables 
to recover from the observation functions a single " mean curve" that conveys 
all the information of the data. 

A major difficulty comes from the fact that there are both amplitude vari- 
ation (in the y-axis) or phase variation (in the x-axis) which prevent any 
direction extraction of the mean, median, correlations or any other statis- 
tical indices for a standard multivariate procedure such as principal com- 
ponent analysis, and canonical correlations analysis, see Kneip and Gasser 
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[1992] or Ramsay and Silverman [2005] and references therein. Indeed, the 
classical cross-sectional mean does not provide a consistent estimate of the 
function of interest / since it fails to capture the structural characteristics 
in the sample of curves as quoted in Ramsay and Li [1998]. Hence, curve 
registration methods (also called curve alignment, structural averaging, or 
time warping) have been proposed in the statistical literature. We refer to, 
just to name a few, Sakoe and Chiba [1978] in Spoken Word Recognition do- 
main, Kneip and Gasser [1992] for Landmark Registration, Silverman [1995] 
for a functional principal component analysis, Wang and Gasser [1997] for 
Dynamic Time Warping, Ramsay and Li [1998] for Continuous Monotone 
Registration, R0nn [2001] for shifted curves, Liu and Miiller [2004] for func- 
tional convex averaging, Gervini and Gasser [2005] for maximum likelihood 
estimation, Gamboa, Loubes, and Maza [2007] for shifts estimation, James 
[2007] for alignment by moments, and Dupuy, Loubes, and Maza [2011] for 
Structural Expectation estimation. 

This issue is closely related to the problem of finding the mean of ob- 
servations lying in a space with an unknown, non necessarily euclidean, 
underlying geometry. The problem is thus twofold. 

First, the mere definition of the mean should be carefully studied. Indeed, 
let £ = {X\, . . . , X n } be a sample of i.i.d random variables of law X € M. 
where A4 is a submanifold of ffi p . If we denote by d the Euclidean distance 
on W, then the classical sample mean, or Frechet sample mean, defined by 

n 

(1) ju = arg min V" d 2 (Xi,fa 

i=l 

is not always a good representative of the given sample £ , and, obviously, 
of the underlying population. Using the geometry of the manifold, it seems 
natural to replace Criterion (1) by 

n 

fa = arg min V] 5 2 {X^fa 

where S is the geodesic distance on manifold M, giving rise to the in- 
trinsic mean, whose existence and properties are studied, for instance, in 
Bhattacharya and Patrangenaru [2003] . When dealing with functional data, 
we assume that the functions fi can be modeled as variables with values on a 
manifold, and curve registration amounts to considering an intrinsic statistic 
that reflects the behaviour of the data. In the following we will consider, for 
a > 0, 

n 

(2) /I? = arg min Y / 5 Q (X u fa. 

%=\ 
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In particular, for a = 1, we will deal with fij, the intrinsic sample median. 

Second, previous construction relies on the choice of the embedding which 
may not be unique, then the manifold itself and its underlying geodesic dis- 
tance. Actually we only have at hand a sample of random variables which 
are sought to be a discretization of an unobserved manifold. Over the last 
decade, some new technics have been developed to find and compute the 
natural embedding of data onto a manifold and to estimate the correspond- 
ing geodesic distance, see for instance de Silva and Tenenbaum [2003] for 
a review of global (Isomap type) and local (LLE type) procedures, while 
applications have been widely developed, see for instance Pennec [2006]. 

In the following, we will consider an approximation, achieved with a graph 
theory approach inspired by works on manifold learning and dimension re- 
duction [Tenenbaum, de Silva, and Langford, 2000]. We will first show that 
curve registration for parametric transformations can be solved using a man- 
ifold geodesic approximation procedure. Then, we will highlight that this 
enables to recover a mean pattern which conveys the information of a group 
of curves. This pattern is used for curve classification for simulated data 
and real data which consists in predicting a particular landscape using the 
reflectance of the vegetation. 

This article falls into the following parts. Section 2 is devoted to the con- 
struction of the approximated geodesic distance. In Section 3, we describe 
the manifold framework point of view for curve registration. We then ex- 
plain how to estimate a representative of a sample of warped curves. The 
performance of this estimator is then studied in Section 4 using simulated 
data, and in Section 5 with a real data set. Concluding remarks are given in 
Section 6. Proofs are gathered in Section 7. 

2. A graph construction for topology estimation. Let X be a ran- 
dom variable with values in an unknown connected and geodesically com- 
plete Riemannian manifold A4 C MP. We observe an i.i.d sample £ = {Xi E 
A4, i = 1, . . . , n} with distribution X. Set d the Euclidean distance on MP 
and 5 the induced geodesic distance on M. Our aim is to estimate intrinsic 
statistics defined by Equation (2). Since the manifold A4 is unknown, the 
main issue is to estimate the geodesic distance between two points on the 
manifold, that is 5 (Xi,Xj). 

Let be the geodesic path connecting two points Xi and Xj , that is the 
minimum length path on M between points Xi and Xj. Denoting by L (7) 
the length of a given path 7 on A4, we have that 5 (X{,Xj) = L (7^). 

In the Isomap algorithm, Tenenbaum et al. [2000] propose to learn mani- 
fold topology from a graph connecting /c-nearest neighbors for a given integer 
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k. In the same way, our purpose is to approximate the geodesic distance 5 
with a suitable graph connecting nearest neighbors. Our approximation is 
carried out in three steps. Thereafter, we denote gij a path connecting two 
points Xi and Xj on a given graph, and L (gij) the length of such a path. 

Step 1. Consider K = (£,E) the complete Euclidean graph associated to 
sample £, that is the graph made with all the points of the sample £ as 
vertices, and with edges 

E = {{Xi,Xj} , i = 1,... ,n- 1, j = i + 1,... ,n}. 

For an Euclidean graph, the edge weights are the edge lengths, that is, the 
Euclidean distances between each pair of points. 

Step 2. Let T = {£, Et) be the Euclidean Minimum Spanning Tree (EMST) 
associated to K, that is, the spanning tree that minimizes 

{Xi.Xj} (:(■;■, 

The underlying idea in this construction is that, if two points Xi and Xj are 
relatively close, then we have that 5 (Xi,Xj) ~ d (Xi, Xj). This may not be 
true if the manifold is very twisted and if too few points are observed, and 
may induce bad approximations, hence the algorithm will produce a good 
approximation for relatively regular manifolds. It also generally requires a 
large number of sampling points on the manifold in order to guarantee the 
quality of this approximation. This drawback is well known when dealing 
with graph based approximation of the geodesic distance. Then, the graph 
T is a connected graph spanning K which mimics the manifold Ai. Further- 
more, an approximation of the geodesic distance 8(Xi,Xj) is provided by 
the sum of all the euclidean distance of the edges of the shortest path on T 
connecting Xi to Xj, namely 

5(Xi,Xj) = min L(gij). 

However, this approximation is too sensitive to perturbations of the data, 
and hence, very unstable. To cope with this problem, we propose to add 
more edges between the data to add extra paths in the data sample and 
thus to increase stability of the estimator. The idea is that paths which are 
close to the ones selected in the construction of the EMST could provide 
alternate ways of connecting the edges. Close should be here understood as 
lying in balls around the observed points. Hence, these new paths between 
the data are admissible and should be added to the edges of the graph. This 
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provides redundant information but also stabilizes the constructed distance, 
and may also provide an answer to the the main defect of the algorithm that 
considers that two points very close with respect to the Euclidean distance 
are also close with respect to the geodesic distance. 

Step 3. Let B (Xj, ej) Cl p the open ball of center Xi and radius ej defined 
by 

6j = max d (Xi ,Xj). 
Let graph K' = (£, E') defined by 

n 

{Xi,Xj} X~X~ C[JB (Xu ei) 

i=l 

where 



XiXj = {X € R p , 3X € [0, 1], X = XX j + (1 - \)Xi} . 
Then, K' is the graph which gives rise to our estimator of the distance 5 : 

(3) 6 (Xi, Xj) = min L (gij) . 

Hence, 5 is the distance associated with K' , that is, for each pair of points 
Xi and Xj , we have 5 (Xi ,Xj) = L (7^ ) where jij is the minimum length 
path between Xj and Xj associated to K' . 

We note that, the 3-steps procedure described above contains widespread 
graph-based methods to achieve our purpose. In this article, our graph-based 
calculations, such as MST estimation or shortest path calculus, were carried 
out with the R Language [R Development Core Team, 2010] with the igraph 
package for network analysis [Csardi and Nepusz, 2006]. 

An example of this 3-steps procedure and its behaviour when the number 
of observations increases are displayed respectively in Figure 1 and Figure 2. 
In Figure 1, points (Xf,Xf) i are simulated as follows : 

^1 2i — n — 1 1 , o f 2i — n — 1\ 2 9 

4 X} = — + e andAT 2 = 2 — + e 2 

n — 1 \ n — 1 J 

where t\ and e| are normaly distributed with mean and variance 0.01. In 
Figure 2, we give some results of graph K' for n G {10,30,100,300}. We 
can see in such a figure that graph K' tends to be close to the manifold 
{(t,t 2 ) € M 2 , t € R}. 

The main difference between our algorithm and the Isomap algorithm lies 
in the treatment of points which are far from the others. Indeed, the first 
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Fig 1. Construction of a subgraph K' from Simulation (4) with n — 30 points. On the 
top left, a simulated data set. On the top right, the associated complete Euclidean graph 
K (Step 1). On the bottom left, the EMST associated with the complete graph K (Step 2). 
On the bottom right, the associated open balls and the corresponding subgraph K' (Step 
3). 



step of the original Isomap algorithm consists in constructing the ^-nearest 
neighbor graph or the e-nearest neighbor graph for a given integer A: or a real 
e > 0. Hence, points which are not connected to the biggest graph, since they 
are too distant, are not used for the construction of the estimated distance. 
Such a step is not present in our algorithm since in the applications we 
consider a distant point is not always an outlier. Hence, we do not exclude 
any points, and rather, for the construction of the EMST, all points of the 
data set are connected. Moreover, the Isomap algorithm requires the choice 
of parameters which are closely related to the local curvature of the manifold 
(see, for instance, Balasubramanian and Schwartz [2002]). This involves a 
heavy computing phase which is crucial for the quality of the construction, 
while, in our version we tend to give an automatic selection of parameters. 
We will show in Section 3 that both procedures used for curve registration 
behave in a similar way and over performs other standard feature extraction 
methods. 
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n=10 n=30 




Fig 2. Evolution of graph K' for Simulation (4) and n £ {10, 30, 100, 300}. 

In the following section, we present a new application of manifold learning 
to the curve alignment problem. 

3. Application to curve alignment. Consider a function / : R — > R, 
which will be the pattern to be recovered, observed in a translation effect 
framework. Let A be a real valued random variable with unknown distribu- 
tion on an interval (6, c) C M. The observation model is defined by 

(5) Xi = f(tj-Ai), ie{l,...,n}, je{l,...,m}, 

where (Ai)- are i.i.d random variables drawn with distribution A which 
model the unknown translation parameters, while (tj)j € M. m stand for the 
measurement points. 

This situation usually happens when individuals experience similar events, 
which are explained by a common pattern /, and when the starting times 
of the events are not synchronized. Such a model has been studied, for in- 
stance, in Silverman [1995] and in R0nn [2001]. This issue has also received 
a specific attention in a semi-parametric framework in Gamboa et al. [2007] 
or Castillo and Loubes [2009]. In these works, among others, shift parame- 
ters are estimated, which enables to align the observations and thus to get 
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rid of the translation issue. Model (5) falls also under the generic warping 
model proposed in Maza [2006] and in Dupuy et al. [2011] which purpose is 
to estimate the underlying structure of the curves. For this, the authors de- 
fine the structural median /gM of the data. In the case of translation effects, 
it corresponds to 

(6) f SM = f (■ - med(A)) 

with med(A) the median of A. Hence, a natural estimator of the structural 
median /sm 5 related to Model (5), would be 

(7) Ism = (f (ti - med(A)) , f (t 2 - ™d(A)) ,...j(t m - ^d(A) 



with med(^4) the median of sample {Ai)^ However, we first note that the 
translation parameters (^4j)i are n °t observed, and, as a consequence, that 
the median med(^4) can not directly be calculated. Then, the function / is 
also unknown, so, estimating m.ed{A) is not enough to calculate /sm- Our 
purpose is to show that our manifold point of view provides a direct estimate 
of /sm without the prior estimation of m.ed(A). 

In order to use the manifold embedding approach, define 

X : R -> R m 

a ^ X{a) = (f(t 1 -a)J(t 2 -a),...,f(t m -a)) 

and set 

C = {X{a) € R m , a € R} . 

As soon as /' 7^ 0, the map X : a 1— » X(a) provides a natural parametrization 
of C which can thus be seen as a submanifold of R m of dimension 1. The 
corresponding geodesic distance is given by 

ra 2 



S(X( ai ),X(a 2 )) 



ra-2 

/ ||X'(a)||da 

J a\ 



The observation model (5) can be seen as a discretization of the manifold 
C for different values {Ai) i . Finding the median of all the shifted curves 
can hence be done by understanding the geometry of space C, and thus 
approximating the geodesic distance between the curves. 

The following theorem states that the structural median /sm defined by 
Equation (7) is equivalent to the median with respect to the geodesic dis- 
tance on C, that is 



■n 



jlj = arg min y^5{Xj,ix) , 
MeC i=i 

which provides a geometrical interpretation of the structural median. 
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Theorem 1. Under the assumption that f ^ 0, we get that 

Vh = /sm- 

Previous theorem can be extended to the more complex case of parametric 
deformations of the type 

X : R 3 -»• R m 

(a, b, c) (->■ X(a, b, c) = (af (tt - b) + c, . . . ,af (t m - b) + c) 

as soon as a ^ and /' ^ 0. Such a model has been described in Vimond 
[2010] and in Bigot and Loubes [2010]. In this case, the submanifold is ob- 
viously of dimension 3. 

In an even more general framework, when the observations can be modeled 
by a set of curves warped one from another by an unobservable deformation 
process, this estimate enables to recover the main pattern. It relies on the 
assumption that all the data belong to a manifold whose geodesic distance 
can be well approximated by the graph structure of the modified minimal 
spanning tree described in Section 2. 

Finally, we propose the following estimator of the structural median 

n 

(8) fi} = arg mm V" $ (X h fi) , 

i=i 

using the geodesic distance S, estimated by the algorithm described in Sec- 
tion 2. 

The numerical properties of this estimator is studied using simulations in 
Section 4, and for real data sets in Section 5. 

4. Simulations. We consider the target function / : R — >■ R defined by 
f(t) = t sin(i). We simulate deformations of this function on j = 1, . . . , m = 
100 equally distributed points tj of the interval [—10,10], according to the 
following model : 

(9) Yi (tj) = Aif (Bitj - d) , i =, . . . ,n, j = 1, . . . , m, 

where (Ai), and (Ci) i are i.i.d uniform random variables on [—10, 10] while 
(Bi) i is an i.i.d sample of a uniform distribution on [—1, 1]. We finally obtain 
a data set of n = 100 curves where each differs from the initial function / 
by a translation and an amplitude deformation. The data is displayed on 
the left graph of Figure 3. 
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We then consider four estimators of the function /. The first one, which 
minimizes the approximated geodesic distance, defined by Equation (8), will 
be referred to as the structural median estimator. The second one is ob- 
tained by the Curve Alignment by Moments procedure (CAM) developped 
by James [2007]. The third one is the template obtained with the Isomap 
strategy, with the "isomap" function of the R package vegan [Oksanen et al., 
2011]. The last one is the mere mean of the data. 

We recall here that the CAM procedure consists on extracting the mean 
pattern by synchronization of the moments of the simulated curves. For this, 
James [2007] introduces the feature function concept for a given function g, 
defined as I g (t) : 

I g (t) > and J I g (t)dt = 1 

and the moments 

M W = J tl g (t)dt and fi g k) = J (t - /4 1} ) I g (t)dt, k>2. 

Then, the CAM procedure align the curves by warping their moments, for 
instance, the amplitude of the peaks, at the location they occur, the variance 
around these peaks, and so on. This method relies on the choice of a proper 
feature function, for instance Ig\t) = \g^(t)\/ f \g^(s)\ds for a given / > 0, 
on an approximation of the functions by splines, and the selection of the 
number of moments to be synchronized. Hence, it highly depends on the 
choice of these tuning parameters. We have chosen the optimal value of the 
parameters over a grid. 

These four estimators are shown on Figure 3. With the CAM or the 
mere mean procedure, the average curve does not reflect the structure of 
the initial curves, or the amplitude of their variations. On the contrary, the 
structural median extracted by Manifold Warping has the characteristics 
of the closest target curve, but is also its best approximation together with 
the pattern obtained with the Isomap strategy. Note here that our version of 
the algorithm for curves provide the same kind of template and is parameter 
free while parameters governing the dimension of the manifold embedding 
must be chosen for the Isomap procedure. Nevertheless, both procedures are 
competitive and lead to similar performance. 

5. Real data. Consider the real data case where an observation curve 
represents the reflectance of a particular landscape and fully characterizes 
the nature of each landscape. The purpose of this study is to predict the 
different landscapes while observing the reflectance profiles. In Figures 4 
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Fig 3. On the left, a simulated data set of warped curves from Model (9) and an estimation 
of f with the mere mean (white line). On the right, the target function f (red dotted 
line), an estimation of the structural median by Manifold Warping (green solid line), an 
estimation obtained by Isomap (blue dot-dashed line), and an estimation obtained with the 
CAM procedure (black dashed line). 

and 5, we present two data sets corresponding to reflectance patterns of two 
landscapes in the same region with the same period. However, the reflectance 
depends on the vegetation whose growth depends on the weather condition 
and the behavior in soil. It is therefore relevant to consider that these profiles 
are deformations in translation and/or amplitude of a single representative 
function of the reflectance behaviour of each landscape in this region at this 
time. 

Our aim is to build a classification procedure. For this, we will use a 
labeled set of curves and extract from each group of similar landscape a rep- 
resentative profile. Then, we will allocate a new curve to the group whose 
representative curve will be the closest. That is the reason why it is impor- 
tant to obtain a pattern which captures the structure of the curves. We will 
use three different ways to get a representative group of curves, the mean 
curve, the CAM method and our method, referred to as the Manifold Warp- 
ing. We will compare their classification performance together with a usual 
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Fig 4. On the left, the first landscape data. On the right, the CAM representative estima- 
tion (black dashed line) and the Manifod Warping estimation (green solid line). 

classification procedure : the classical /c-nearest neighbours. 

In Figure 4, we observe that the CAM average oversmoothes the peaks 
of activity at times 12 and 22 to make them almost nonexistent. This is 
a clear defect since, according to the experts of landscape remote sensing, 
these peaks of activity are representative of the nature of landscape. Indeed, 
these peaks convey essential informations which determines, among other 
things, the type of landscape. On the other hand, these changes are very 
well rendered by the pattern obtained by Manifold Warping. The same con- 
clusions can be drawn in Figure 5 for an other landscape. In this application 
domain, extracting a curve by Manifold Warping is best able to report data 
as reflecting their structure and thus to obtain a better representative. 

Now, we try to identify "unknown" landscapes by comparing each curve 
to the mean pattern of each group. The allocation rule is built using the Eu- 
clidean distance. Note that here we have sought to classify the landscapes, 
not using the whole curve which would correspond to a whole year of ob- 
servation but using only a part of the curves, namely all the observations 
before t = 30. To benchmark our procedure, we compare our performance 
to the method of the /c-nearest neighbors classification. Finally, we obtain 
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Fig 5. On the left, the second landscape data. On the right, the CAM representative esti- 
mation (black dashed line) and the Manifod Warping estimation (green solid line). 

the confusion matrices displayed in Tables 1 and 2. We get a much better 
discrimination of landscapes with the method consisting in estimating a rep- 
resentative by Manifold Warping than by the CAM method or by classical 
mean. 



Pixel 


Manifold classification 


CAM classification 


Reference 


Landscapel 


Landscape2 


Landscapel 


Landscape2 


Landscapel 


21 





12 


9 


Landscape2 


1 


19 


1 


19 



Table 1 

Manifold Warping and CAM confusion matrices. 



6. Conclusion. By using an Isomap inspired strategy, we have ex- 
tracted from a pattern of curves, a curve which, playing the role of the 
mean, serves as a pattern conveying the information of the data. In some 
cases, in particular when the structure of the deformations entails that the 
curve can be embedded into a manifold regular enough, we have shown that 
this corresponds to finding the structural expectation of the data, devel- 
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Pixel 


Mean classification 


fc-nn classification 


Reference 


Landscapel 


Landscape2 


Landscapel 


Landscape2 


Landscapel 


12 


9 


15 


6 


Landscape2 





20 


2 


18 



Table 2 



Classical mean and k-nearest neighbors confusion matrices. 

oped in Dupuy et al. [2011], which improves the performance of other mean 
extraction methods. This enables to derive a classification strategy that as- 
signs a curve to the group, whose representative curve is the closest, with 
respect to the chosen distance. Of course, the performance of this allocation 
rule deeply relies on the good performance of the pattern extraction. 

One of the major drawbacks of this methodology are that first a high 
number of data are required in order to guarantee a good approximation of 
the geodesic distance at the core of this work. Actually, note that the number 
of observations, i.e the sampling rate of the manifold highly depends on the 
regularity of the manifold such that the assumption that the euclidean path 
between two observations follow approximatively the geodesic path. Hence, 
the data set should be carefully chosen for the manifold to be smooth enough. 
We point out that an enhancement could come from a prior registration 
procedure first applied to the curve and then the manifold warping procedure 
applied to the registered data. 

The second drawback which may also be viewed as an advantage, is the 
following : the extracted pattern is a curve that belong to the observations. 
One the one hand, it may contains noise if the data are noisy observations, 
but on the other hand it thus guarantees that the pattern shares the mean 
properties and specifies of the observations. A solution when the noise must 
be removed is either to directly smooth the resulting pattern or to consider 
the neigbourhood of the extracted pattern with respect to the approximated 
geodesic distance and then use a kernel estimator with these observations 
to obtain a regularized mean curve. 

Nevertheless, we promote this procedure when a large amount of data 
are available and when the sets of similar curves share a common behaviour 
which fully characterizes the observations, coming from an economic, phys- 
ical or biological model for instance. This methods has been applied with 
success to a large amount of cases. Numerical packages for R or Matlab are 
available on request. 

7. Appendix. 
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Proof of Theorem I. Take \i = X(a) with a €]b,c[, we can write 

n 

ft = arg min V^(I(i,),I(a)) 



= are; min > D (A;, a) = arg min C(a) 
i(o)eC^ x(«)ec 

1 = 1 

where D is the following distance on ]6, c[ : 



/ ^'(a^da 



In the following, let (A^. the ordered parameters. That is 

A(i) < A (2) < ••• < A (n) . 
Then, for a given q c[ such that Au\ < a < Aq +1 \, we get that 

j'-i 

C(a) = jl>(a,^ (i) ) + ^lD(%,,4 (i+1) ) 

i=l 

re-1 

+ (n - (a, + ^ (n - i)£> (A (<) , A [i+1) ) . 

i=j+l 

For the sake of simplicity, let n = 2q + 1. It follows that med(A) = A^ q+ iy 
Moreover, let a = A(j\ with j < q + 1. By symmetry, the case j > q + 1 will 
hold. Then, we rewrite C (a) as 

j-i n-l 
C(a) = ^2iD (A {t) ,A {i+1) )+^2(n-i)D(A (i) ,A {i+1) ) 

i=l i=j 

and, by introducing A^ q+1 y we get that 

j'-i 4 
C(a) = + J^*D(A (i) ,i4 (i+1) ) 

i=l i=j 
<3 n— 1 

+ X)(n-2i)D(A (i) ,A (i+1) )+ (n-i)D(A (i) ,A {l+1) ) . 

i=j i=q+l 

Finally, we notice that 

C{a) = C (A {q+1) ) + - 2i)D {A {i) ,A (i+1) ) > C (A (q+1) ) . 
i=j 
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And the result follows since 

ju) = arg min C(a) = X (A (q+1) ) = X (med(A)) = f SM . 

X(a)eC v ' V / 

□ 
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